Data Science and Statistical Learning Course Final Project.
The Stop, Question and Frisk program is a practice, used by the New York City Police Department (NYPD), of temporarily halting, questioning, and, in certain cases, searching civilians on the street for weapons and other dangers. It is also called a “Terry stop,” after the Supreme Court case Terry v. Ohio (1968).
The Stop, Question and Frisk practice is often defended with the Broken Windows Theory, which holds that even low-level crime and civil disorder lead to more serious crime in urban environments. However, after NYPD officer Adrian Schoolcraft made extensive recordings documenting the department’s Stop and Frisk policy, numerous civil rights organizations, such as the NYCLU, raised concerns that the program unfairly targets certain minorities, such as African-Americans and Hispanic-Americans.
A major turning point was the 2013 court case Floyd v. City of New York and a subsequent NYPD mandate that requires officers to thoroughly justify the reason for making a stop.[34] In 2013, 191,558 stops were made.[35]
Visualize the data to get intuition.
# Samuel's Part
# read_csv() comes from the readr package, so call it with an explicit namespace
SQF1718 <- readr::read_csv("https://github.com/samuellee19/econ122_sqf/raw/master/Data%20Files%20Only/SQF1718.csv")
# Priyanka's Part
# Download the 2018 SQF workbook from NYCLU
url <- "https://www.nyclu.org/sites/default/files/field_documents/2018_sqf_database.xlsx"
destfile <- "2018_sqf_database.xlsx"
curl::curl_download(url, destfile)
# Convert only the downloaded workbook to csv (rio infers both formats from the file extensions)
csvfile <- sub("\\.xlsx$", ".csv", destfile)
rio::convert(destfile, csvfile)
unlink(destfile) # delete the xlsx file once converted
SQF2018 <- read.csv(csvfile)
# Download the 2017 SQF workbook from the NYPD site
url <- "https://www1.nyc.gov/assets/nypd/downloads/excel/analysis_and_planning/stop-question-frisk/sqf-2017.xlsx"
destfile <- "2017_sqf_database.xlsx"
curl::curl_download(url, destfile)
# Convert only the downloaded workbook to csv (rio infers both formats from the file extensions)
csvfile <- sub("\\.xlsx$", ".csv", destfile)
rio::convert(destfile, csvfile)
unlink(destfile) # delete the xlsx file once converted
SQF2017 <- read.csv(csvfile)
# Download the 2016 SQF workbook from NYCLU
url <- "https://www.nyclu.org/sites/default/files/field_documents/2016_sqf_database.xlsx"
destfile <- "2016_sqf_database.xlsx"
curl::curl_download(url, destfile)
# Convert only the downloaded workbook to csv (rio infers both formats from the file extensions)
csvfile <- sub("\\.xlsx$", ".csv", destfile)
rio::convert(destfile, csvfile)
unlink(destfile) # delete the xlsx file once converted
SQF2016 <- read.csv(csvfile)
url <- "https://www1.nyc.gov/assets/nypd/downloads/excel/analysis_and_planning/stop-question-frisk/sqf-2015.csv"
destfile <- "sqf-2015.csv"
curl::curl_download(url, destfile)
SQF2015 <- read.csv(destfile)
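The xlsx downloads above repeat the same download, convert, and read steps for each year. One possible way to fold them into a single function is sketched below. The names `xlsx_to_csv_name` and `fetch_sqf_xlsx` are hypothetical and not part of the original project, and the sketch assumes the curl and rio packages are installed.

```r
# Hypothetical helper (not in the original project):
# derive the csv file name from an xlsx file name.
xlsx_to_csv_name <- function(path) {
  sub("\\.xlsx$", ".csv", path)
}

# Hypothetical helper (not in the original project):
# download one workbook, convert it to csv, delete the xlsx, and read the csv.
# Assumes the curl and rio packages are installed.
fetch_sqf_xlsx <- function(url, destfile) {
  curl::curl_download(url, destfile)
  csvfile <- xlsx_to_csv_name(destfile)
  rio::convert(destfile, csvfile)  # formats are inferred from the extensions
  unlink(destfile)                 # remove the xlsx once the csv exists
  read.csv(csvfile)
}

# Usage (needs network access, so it is left commented out here):
# SQF2018 <- fetch_sqf_xlsx(
#   "https://www.nyclu.org/sites/default/files/field_documents/2018_sqf_database.xlsx",
#   "2018_sqf_database.xlsx"
# )
```

Converting only the file that was just downloaded, rather than every xlsx file in the working directory, keeps unrelated workbooks from being converted and deleted by accident.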
# Plot the change in crime (crimeChange) against the number of stop and frisks (count), colored by year
# No visible linear trend in the data
library(ggplot2)
ggplot(crimeAndCount, aes(x = crimeChange, y = count)) +
  geom_point(aes(color = year)) +
  geom_smooth(method = "lm", color = "red")

# Separating by year in case policy changes had an effect; still no clear trend
ggplot(crimeAndCount, aes(x = crimeChange, y = count)) +
  geom_point() +
  geom_smooth(method = "lm", color = "red") + facet_wrap(~year)

# Plot the standard deviation of crime count (crimeSD) against the number of stop and frisks (count), colored by year
# No visible linear trend in the data
ggplot(crimeAndCount, aes(x = crimeSD, y = count)) +
  geom_point(aes(color = year)) +
  geom_smooth(method = "lm", color = "red")

# Separating by year in case policy changes had an effect; still no clear trend
ggplot(crimeAndCount, aes(x = crimeSD, y = count)) +
  geom_point() +
  geom_smooth(method = "lm", color = "red") + facet_wrap(~year)
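The "no clear trend" reading of the plots above can also be checked numerically. The sketch below computes the Pearson correlation overall and per year, then fits the same kind of linear model that `geom_smooth(method = "lm")` draws. The data frame `toy` is a made-up stand-in for `crimeAndCount`: its column names follow the plots above, but its values are illustrative only and are not NYPD data.

```r
# Illustrative stand-in for crimeAndCount; values are invented, not real data.
toy <- data.frame(
  year    = rep(c(2015, 2016), each = 4),
  crimeSD = c(1, 2, 3, 4, 1, 2, 3, 4),
  count   = c(10, 12, 9, 11, 8, 7, 9, 8)
)

# Overall Pearson correlation between crimeSD and the stop count
overall_r <- cor(toy$crimeSD, toy$count)

# Per-year correlations, mirroring the facet_wrap(~year) view
by_year_r <- sapply(split(toy, toy$year), function(d) cor(d$crimeSD, d$count))

# Linear fit of count on crimeSD; the coefficient table reports
# the slope estimate together with its standard error and p-value
fit <- lm(count ~ crimeSD, data = toy)
print(overall_r)
print(by_year_r)
print(summary(fit)$coefficients)
```

A correlation near zero and a slope whose p-value is large would be consistent with the visual impression of no linear trend. With the real data, `toy` would be replaced by `crimeAndCount`.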
